User Collaboration for Improving Access to Historical Texts

Authors

  • Clemens Neudecker
  • Asaf Tzadok
Abstract

The paper will describe how web-based collaboration tools can engage users in the building of historical printed text resources created by mass digitisation projects. The drivers for developing such tools will be presented, identifying the benefits that can be derived for both the user community and cultural heritage institutions. The perceived risks, such as new errors introduced by the users, and the limitations of engaging with users in this way will be set out, along with the lessons that can be learned from existing activities, such as the National Library of Australia’s newspaper website, which supports collaborative correction of Optical Character Recognition (OCR) output. The paper will present the work of the IMPACT (Improving Access to Text) project, a large-scale integrating project funded by the European Commission as part of the Seventh Framework Programme (FP7). One of the aims of the project is to develop tools that help improve OCR results for historical printed texts, specifically those works published before the industrial production of books from the middle of the 19th century. Technological improvements to image processing and OCR engine technology are vital to improving access to historic text, but engaging the user community also has an important role to play. Utilising the intended user can help achieve the levels of accuracy currently found in born-digital materials. Improving OCR results will allow for better resource discovery and enhance the performance of text mining and accessibility tools. The IMPACT project will specifically develop a tool that supports collaborative correction and validation of OCR results and a tool to allow user involvement in building historical dictionaries which can be used to validate word recognition. The technologies use the characteristics of human perception as a basis for error detection.

Liber Quarterly, Volume 20, Issue 1, 2010

Similar resources

Investigating the Effect of Empirical Interventions on Improving the Historical Urban Texts of Iran and the World A Comparative Study of Historical Cities of Birjand in Iran and Pensicola in Spain

The purpose of this research was to study historical urban texture based on the experience of intervention in Birjand, in comparison with the historic city of Pensicola. To this end, the study drew on library research. Following the Delphi method, three stages of field survey were prepared. The first stage of open field study is considered to be all effective factors...

Visual analysis and exploration of ancient texts with a user-driven concept search

In recent decades, the amount of digital data has grown rapidly. This has also affected the humanities, where humanists and computer scientists collaborate to digitize historical texts for preservation purposes. Due to this effort, many historical texts that were accessible to few scholars became widely available in the form of digital texts. The focus shifted from printed to digital texts. T...

The Effect of User-Friendly Texts vs. Impersonal and Hybrid Texts on the Reading Comprehension Ability of Iranian EFL Learners

This study focuses on the effect of user-friendly, impersonal, and hybrid texts on the reading comprehension ability of Iranian foreign language learners. Forty-five students of Alzahra University were selected on the basis of their performance in a recent TOEFL. They were given three different texts (each group of 15 students was given one type) describing the same area of English usage, w...

Improving Access to Digitized Historical Newspapers with Text Mining, Coordinated Models, and Formative User Interface Design

Most tools for accessing digitized historical newspapers emphasize relatively simple search; but, as increasing numbers of digitized historical newspapers and other historical resources become available, we can consider much richer modes of interaction with these collections. For instance, users might use exploratory search for looking at larger issues and events such as elections and campaigns...

Use of Semantic Similarity and Web Usage Mining to Alleviate the Drawbacks of User-Based Collaborative Filtering Recommender Systems

One of the best-known recommendation methods is user-based Collaborative Filtering (CF). This system compares the active user’s item ratings with the historical rating records of other users to find similar users, and recommends items that seem interesting to these similar users and have not been rated by the active user. As a way of computing recommendations, the ultimate goal of the user-ba...

Publication date: 2010